Text File | 1996-07-21 | 40KB | 889 lines
Update v2.0.1: FileFlex International WorldFlex Functions
* Understanding Character-Level Sort Order
* Custom Character Sort Orders
* Creating a Single-Byte Custom Sort Order Table
* Creating Single-Byte Sort Order Utility Scripts
* Understanding Double-Byte Sort Order Tables
* Creating Double-Byte Sort Order Tables
* Tricks with Sort Order
* Setting the Sort Order with FileFlex
* Character Translation
* Creating Character Translation Utility Scripts
* Translating Characters Using FileFlex
* Case Translation
* Creating Case Translation Utility Scripts
* Intelligent Case Conversion Using FileFlex
* Standalone Intelligent Case Conversion Function
FileFlex is used within multimedia productions throughout the world. While
standard ASCII is prevalent, it is certainly not ubiquitous. When dealing
with international languages, it's necessary to account for differences in
character sorting order, for differences in case conversion, for differences
in character values, and for double-byte characters.
FileFlex' new WorldFlex technology now gives you the ability to build
international flexibility into your applications with unprecedented power.
FileFlex' WorldFlex technology gives you true dynamic localization. Unlike
virtually all other so-called "world-aware" implementations, you're not
forced to rely on a particular operating system revision or a
country-nationalized version of an application. FileFlex allows you to
define your own international conversion tables and apply them on-the-fly to
any data management task. This dynamic localization functionality allows you
to switch languages, character sets, sort orders, and conversions at any
time throughout the operation of your multimedia production instantly, with
virtually no impact upon FileFlex' already blazing performance.
FileFlex' WorldFlex technology falls into three broad categories:
* Dynamic character-level sort order: FileFlex allows you to use indexes
and queries that dynamically switch between sort-order tables. Finally,
an accented "a" character is treated like a regular "a", rather than
something from Mars. Sort orders can be specified for either
single-byte or double-byte languages.
* Character translation: As many FileFlex users have discovered, the
special diacritical characters have different values between Macintosh
and Windows, and even between DOS and Windows. FileFlex allows you to
convert characters so that all the diacritical marks (and any other
conversions you may need) are all in the right places and your
characters look just right.
* Case conversion: Normal case conversion routines apply a simple
heuristic to determine the upper case value of a character. Converting
an "a" to an "A" is simply the matter of subtracting 32. But what about
converting a "u" with an umlaut to an upper case value? What about
converting vowels with accents to their equivalent upper case
characters? FileFlex provides two standalone functions that allow you
to use custom case conversion tables so that your case conversions make
sense in your language. FileFlex' built-in index and query
functions also take into account custom case conversion tables so your
data can be case insensitive when desired (as opposed to case insane).
Before we proceed with details of these functions, we'd like to thank our
customers throughout the world for working with us to understand the
individual needs of different languages and customs and how those needs
apply to the authoring of multimedia productions worldwide.
Understanding Character-Level Sort Order
----------------------------------------------------------------------------
Note: The character-level sorting features in FileFlex require that you have
a measurable amount of programming expertise. These features let you modify
the very core of FileFlex data management and require both care to use and
experience to understand. If you're not a pretty advanced scripter or
programmer, you may want to find an experienced "buddy" to team up with
before attempting to utilize these powerful capabilities.
----------------------------------------------------------------------------
FileFlex uses index files to sort information. When you create an index
file, you're choosing a field that will determine the sort order of the
database. For example, you might choose to sort on zipcode (a numeric code
in the US that helps the post office tell where to deliver mail--in other
countries this is often called the postal code), meaning that records
containing 08553 in the zipcode field will be earlier in the database than
records with 94404 in the zipcode field. Likewise, if you chose to organize
your data based on last name, then "Clinton" would come before "Kennedy".
When you switch indexes, FileFlex doesn't reorder the entire database of
records. Instead it adopts a different sort order based on the data in the
fields. FileFlex creates the order of information in an index file when
DBCreateIndex is called. It maintains and updates that order of information
as part of the process of writing a record.
When FileFlex updates an index file, it's comparing the values in two
different records. When it looks at "Clinton" and "Kennedy", it looks at the
first characters (i.e., "C" and "K") and determines that "C" comes before
"K" and therefore "Clinton" comes before "Kennedy".
This comparison of "C" vs. "K" is based on the standard ordered table we
call ASCII (American Standard Code for Information Interchange). When
FileFlex compares "C" against "K", it's really getting the ASCII value of
"C" (67 decimal) and comparing it to the ASCII value of "K" (75 decimal).
Since 67 comes before 75, then "C" comes before "K".
Note: Character sorting is case sensitive. A lower case "c" is ASCII 99
while an upper case "C" is ASCII 67. If you were to compare "clinton"
(note the lower case "c") against "Kennedy", "Kennedy" would come first
because the ASCII value of "K" (ASCII 75) is less than that of lower case
"c".
So, when FileFlex looks at "CLINTON" and "KENNEDY", it's really looking at
the comparative weights (or priorities) of the individual characters,
according to their representation in ASCII. Here are the two strings and
their corresponding values:
 C   L   I   N   T   O   N
67  76  73  78  84  79  78
 |   |   |   |   |   |   |
75  69  78  78  69  68  89
 K   E   N   N   E   D   Y
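The character-by-character comparison described above can be modeled in a few lines. This is a minimal sketch in Python rather than Lingo, used only so the arithmetic can be checked; compare_ascii is a hypothetical helper name and not a FileFlex function.

```python
def compare_ascii(a: str, b: str) -> int:
    """Return -1, 0, or 1 by comparing character codes left to right."""
    for ca, cb in zip(a, b):
        if ord(ca) != ord(cb):
            return -1 if ord(ca) < ord(cb) else 1
    # In this model, when all shared positions match the shorter string wins.
    return (len(a) > len(b)) - (len(a) < len(b))

print([ord(c) for c in "CLINTON"])          # code value of each character
print(compare_ascii("CLINTON", "KENNEDY"))  # "C" (67) is below "K" (75)
print(compare_ascii("clinton", "Kennedy"))  # lower case "c" (99) is above "K"
```

The first differing character decides the result, which is why "Kennedy" sorts ahead of "clinton" when the case differs.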
Custom Character Sort Orders
FileFlex' new WorldFlex technology allows you to customize the
character-level sort order used by the FileFlex indexing routines. There are
two primary reasons you might want to do this:
* To sort in descending rather than ascending order
* To sort according to sorting rules different than ASCII, in particular
for languages other than English.
In fact, a very important part of WorldFlex technology is the ability to
change the sort order of your characters, and thereby sort your database
according to the sorting rules you feel are currently appropriate.
Many so-called "internationalized", "localized", or "world-aware" systems do
provide support for character sorting order for multi-country use. But they
are usually available only when you're running the localized version of the
operating system or database application. While many of our friends outside
the US are grateful for any mechanism that recognizes their native language,
FileFlex doesn't stop there. FileFlex' new WorldFlex technology is vastly
more powerful. FileFlex allows you to change your sorting order on-the-fly,
as you switch index files. Nothing else can do this!
Here's an example of where this is so powerful: Imagine you're a
multi-national firm with customers throughout the world. When you do a query
to list your customers in the US, the ASCII sort order is just fine. But
when you do a query to list customers in Japan, you want the customers'
names sorted by the appropriate sorting conventions for the Japanese
language and character sets--not according to the rather provincial
expectations of ASCII. With FileFlex, you can switch from an ASCII index to
an index ordered according to Japanese sort order absolutely instantly.
Creating a Single-Byte Custom Sort Order Table
Character sort orders are controlled by a custom sort order table. For
applications and languages that use single-byte characters (typically,
"roman" languages), each character can be represented by a single byte.
Since a byte is 8 bits wide, this allows for 256 characters.
You create a sort order table in your host development environment's
programming language (our examples will be in Director's Lingo). We do this
by building a table containing three bytes of data for each character in the
sort order:
* Leader Flag Byte: For single-byte languages, this byte is always set to
255.
* Priority Multiplier Byte: For single-byte languages, this byte is also
always set to 255.
* Priority Value Byte: This value signifies the priority of the character
in the list (never use 0).
At the end of all of the three-byte sets, a single byte containing the value
0 is used to terminate the table.
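To make the layout concrete, here is a minimal sketch that packs 256 priority values into three-byte sets followed by the terminator. It is written in Python purely as a model of the bytes (the real table is built in your host language), and build_single_byte_table is a hypothetical name, not part of FileFlex.

```python
def build_single_byte_table(priorities):
    """Pack 256 priority values into leader/multiplier/priority triplets."""
    if len(priorities) != 256:
        raise ValueError("need one priority per character code 0..255")
    table = bytearray()
    for priority in priorities:
        if not 1 <= priority <= 255:
            raise ValueError("priority values must be 1..255, never 0")
        # Leader flag byte, priority multiplier byte, priority value byte.
        table += bytes([255, 255, priority])
    table.append(0)  # single terminator byte
    return bytes(table)

# Same values as ASCII order, with 255 standing in for the forbidden 0.
ascii_order = build_single_byte_table([255] + list(range(1, 256)))
print(len(ascii_order))  # 256 triplets plus one terminator byte
```

Under these assumptions a single-byte table is always 769 bytes long: 256 entries of three bytes each, plus the terminator.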
Before we look in more detail at the Priority Value Byte, let's first look
at how ASCII prioritizes its characters:
A   B   C   D   E   ...  V   W   X   Y   Z
65  66  67  68  69  ...  86  87  88  89  90
Since "A" is an ASCII 65, it's got a lower value than "D", which is an ASCII
68. The numbers 65 and 68 correspond to the priority value of the various
letters. Likewise, in FileFlex' custom sort order tables, the lower the
priority value, the earlier in the sort the character will be placed. If we
wanted to sort in reverse order ("Z" before "A"), we could assign different
priority values, giving "Z" a much lower number than "A", as in the
following list:
Z   Y   X   W   V   ...  E   D   C   B   A
65  66  67  68  69  ...  86  87  88  89  90
With the priorities shown above, if we looked up a "D", we'd see its value
was 87. Since an "A" has a priority value of 90, the "D" would come earlier
in the list. If we used this set of priority values, "KENNEDY" would
certainly appear before "CLINTON".
It's important to remember that the priority value is entirely up to you. If
you wanted words beginning with vowels (A, E, I, O, and U) to come at the
beginning of the list, you might create the following table of priority
values:
A   E   I   O   U   B   C   D   F   G   H   J   K   L
65  66  67  68  69  70  71  72  73  74  75  76  77  78  ...
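The effect of a vowels-first list like the one above can be tried out with a small model. This Python sketch is only an illustration of the idea: the priorities for letters after "L" simply continue the same pattern (an assumption, since the list above is elided there), and the names are sample data.

```python
VOWELS_FIRST = "AEIOUBCDFGHJKLMNPQRSTVWXYZ"
# Priority values start at 65, as in the list above (an arbitrary choice).
PRIORITY = {ch: 65 + i for i, ch in enumerate(VOWELS_FIRST)}

def sort_key(word):
    """Replace each letter by its custom priority value."""
    return [PRIORITY[ch] for ch in word]

names = ["CLINTON", "EISENHOWER", "ADAMS", "BUSH"]
print(sorted(names, key=sort_key))
```

Names starting with a vowel move to the front, and the remaining names keep their usual relative order.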
FileFlex determines where in the sort order table to find a priority value
based on the character's actual computer-code value (usually ASCII). So,
since "A" has the ASCII code value of 65, FileFlex will look in the 65th
entry in the sort order table to retrieve the priority value. Let's make
this a bit clearer by constructing a partial sort order table for
traditional ASCII (note, we're showing all three data bytes as described
above and all numbers are in base-10):
Entry Pos        65            66            67            68
US Char          "A"           "B"           "C"           "D"
Data Bytes   255 255 065   255 255 066   255 255 067   255 255 068

Entry Pos        69            70            71            72
US Char          "E"           "F"           "G"           "H"
Data Bytes   255 255 069   255 255 070   255 255 071   255 255 072

Entry Pos        73            74            75            76
US Char          "I"           "J"           "K"           "L"
Data Bytes   255 255 073   255 255 074   255 255 075   255 255 076
So, to create a FileFlex sort order table that matches traditional ASCII in
ascending order, you'd want "A" to have a sort order priority of 65, so the
third data byte at position 65 would be the value 65.
Now let's look at how the table would change if we wanted to sort everything
in reverse order (note that we've reversed the entire ASCII character set):
Entry Pos        65            66            67            68
US Char          "A"           "B"           "C"           "D"
Data Bytes   255 255 190   255 255 189   255 255 188   255 255 187

Entry Pos        69            70            71            72
US Char          "E"           "F"           "G"           "H"
Data Bytes   255 255 186   255 255 185   255 255 184   255 255 183

Entry Pos        73            74            75            76
US Char          "I"           "J"           "K"           "L"
Data Bytes   255 255 182   255 255 181   255 255 180   255 255 179
Using the above table, when FileFlex encounters the character "A", which has
the ASCII value of 65, it looks at the 65th entry in the table. It then
retrieves the priority value, which is 190. If FileFlex then looks for "C"
(in the 67th entry in the table), it retrieves the priority value of 188.
Since 188 is less than 190, FileFlex will put "C" before "A".
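The lookup just described can be checked with a short model. This is a Python sketch, not FileFlex code: it assumes the three-byte entries sit back to back starting at the beginning of the table, so the priority value of character code n is the byte at offset n * 3 + 2. The helper name priority_of is hypothetical.

```python
def priority_of(table, code):
    """Read the priority value byte of the entry for a character code."""
    return table[code * 3 + 2]

reverse = bytearray()
for code in range(256):
    # Code 0 keeps 255. Since 0 is never a legal priority, this sketch
    # clamps the last code (255) to 1 instead of letting it reach 0.
    priority = 255 if code == 0 else max(1, 255 - code)
    reverse += bytes([255, 255, priority])
reverse.append(0)  # terminator byte

print(priority_of(reverse, ord("A")), priority_of(reverse, ord("C")))
```

With this table "A" reads back as 190 and "C" as 188, so "C" sorts ahead of "A".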
Creating Single-Byte Sort Order Utility Scripts
The best way to create the sort order table is to write a simple utility
script. Here's an example script that simply builds a sort order table in
ASCII order:
on buildSortOrder_ASCII
  global ASCII
  put "" into theTable
  repeat with i = 0 to 255
    put numToChar(255) after theTable -- leader flag: always 255 for single-byte
    put numToChar(255) after theTable -- priority multiplier: always 255 for single-byte
    if i = 0 then
      put numToChar(255) after theTable -- priority value: never use 0
    else
      put numToChar(i) after theTable -- priority value
    end if
  end repeat
  put numToChar(0) after theTable -- terminator byte code
  put theTable into ASCII
end buildSortOrder_ASCII
Note the name of the handler is "BuildSortOrder_ASCII". We've developed a
convention where the routine that builds the sort order is called
"BuildSortOrder_" and the name of the sort order itself is appended to the
end. The sort order table is placed in a global variable of the same name.
So, for a sort order for French Canadian, we recommend naming the handler
"BuildSortOrder_FrenchCanadian" and the global variable containing the sort
order "FrenchCanadian".
Note that the routine above places the actual byte value into the string by
using numToChar(x). This places a single byte value corresponding to the
number in the string location. Each set of data bytes in the table gets two
bytes with 255 (for the leader flag and the priority multiplier), and the byte
corresponding to the priority value. Finally, after all the data byte sets
are added to the string, BuildSortOrder_ASCII appends a terminator byte
(value 0).
Here's an example routine that reverses the ASCII sort order, placing the
table in the global ASCIIReverse:
on buildSortOrder_ASCIIReverse
  global ASCIIReverse
  put "" into theTable
  put 255 into priority
  repeat with i = 0 to 255
    put numToChar(255) after theTable -- leader flag: always 255 for single-byte
    put numToChar(255) after theTable -- priority multiplier: always 255 for single-byte
    if i = 0 then
      put numToChar(255) after theTable -- priority value: never use 0
    else if priority < 1 then
      put numToChar(1) after theTable -- i = 255 would yield 0, so clamp to 1
    else
      put numToChar(priority) after theTable -- priority value
    end if
    put priority-1 into priority
  end repeat
  put numToChar(0) after theTable -- terminator byte code
  put theTable into ASCIIReverse
end buildSortOrder_ASCIIReverse
----------------------------------------------------------------------------
WARNING: Make absolutely certain you end each sequence with a numToChar(0)
terminator byte. Failure to do this could cause FileFlex to scan beyond the
end of the sort order table and the results could be unpredictable and your
program could abnormally terminate.
----------------------------------------------------------------------------
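Because a missing terminator is so easy to overlook, it can help to sanity-check a table before handing it to FileFlex. The following is a minimal sketch in Python that models the rules stated in this document (terminator of 0, three bytes per entry, 256 entries per page, no 0 inside the table); check_sort_table is a hypothetical helper, and the rules it encodes are our reading of the text rather than a FileFlex API.

```python
def check_sort_table(table):
    """Return a list of problems found in a candidate sort order table."""
    problems = []
    has_terminator = len(table) > 0 and table[-1] == 0
    if not has_terminator:
        problems.append("missing terminator byte (0)")
    body = table[:-1] if has_terminator else table
    # Each page holds 256 entries of three bytes each.
    if len(body) == 0 or len(body) % (256 * 3) != 0:
        problems.append("body is not a whole number of 256-entry pages")
    if 0 in body:
        problems.append("0 used inside the table")
    return problems

good = bytes([255, 255, 1]) * 256 + bytes([0])
print(check_sort_table(good))
print(check_sort_table(good[:-1]))
```

An empty result means none of the checks fired; it does not prove the priorities themselves are what you intended.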
Understanding Double-Byte Sort Order Tables
If the language you're sorting uses double-byte characters (like certain
Japanese and Chinese character sets), you'll need to create double-byte sort
order tables. Double-byte character sets are different because they use two
bytes for many characters. The computer distinguishes between a standard
single-byte character and a double-byte character by the existence of a leader
byte. This leader byte tells the computer that the byte that follows the
leader byte is to be treated as a special character, rather than simply part
of the standard ASCII table.
FileFlex sort order tables are not limited to 256 entries. Instead, they can
hold anywhere from 256 entries to 65,280 entries (255 * 256), where each
entry is the three-byte set described earlier. Each set of 256 entries in the
sort order table is called a "sort order page" and the maximum number of sort
order pages allowed by FileFlex is 255.
If you recall from earlier, each character value is represented in the sort
order table by three bytes, a leader char byte, a priority multiplier byte,
and a priority value byte. Also, if you recall, the leader char byte for
single-byte sort order tables was always 255. That told FileFlex to look in
the very first page of the sort order table (i.e., the very first set of 256
entries) for the character's priority value.
When you're using double-byte character sets, you'll need more than one
256-byte page to represent the sort order. The value that's placed in the
leader character tells FileFlex in which sort order page to look for the
priority value of the character which follows the leader character. Let's
diagram that out:
Suppose that your language character set uses characters with the value of
128 as a leader character. Now, let's suppose your database has a
double-byte character with the values 128 and 065 respectively for the two
bytes. Here's how the sort order table might be defined:
Sort order page 0
---------------------------
Position #128: 001 255 255
Sort order page 1
---------------------------
Position #65: 255 255 015
When reading the character stream, FileFlex would read the first byte and
determine its value was 128. It would then go to position 128 in the sort
order table and read the first byte. Since the first byte (the leader byte
flag) is not a 255, it would know that 128 was a leader byte. Since the
leader byte flag is 1, FileFlex would know that the next character retrieved
should be compared against sort order page 1 (located in the second bank of
256 entries).
FileFlex would now read the second byte of the character. Since it knows
that this byte is the second of a double-byte character, FileFlex will then
determine the byte's value (in this case 65) and jump 65 entries into the
second sort order page (or to entry 321...256+65...of the full sort order
table). At entry 321 (position 65 in the second page) FileFlex would look at
the priority value byte and determine that the priority of the character
represented by 128 065 is 15.
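The walk-through above can be traced in a small model. This Python sketch is an illustration under stated assumptions, not FileFlex' internal code: it treats a page as 256 three-byte entries stored back to back, ignores the priority multiplier, and fills every other entry with a placeholder priority of 1. The names make_page, set_entry, and lookup are hypothetical.

```python
ENTRY = 3    # bytes per entry: leader flag, multiplier, priority value
PAGE = 256   # entries per sort order page

def make_page():
    """A page where every entry is an ordinary character of priority 1."""
    return bytearray([255, 255, 1] * PAGE)

def set_entry(table, page, pos, triplet):
    start = (page * PAGE + pos) * ENTRY
    table[start:start + ENTRY] = bytes(triplet)

table = make_page() + make_page()
set_entry(table, 0, 128, (1, 255, 255))  # 128 is a leader byte -> page 1
set_entry(table, 1, 65, (255, 255, 15))  # second byte 65 has priority 15
table.append(0)                          # terminator byte

def lookup(table, data):
    """Return the priority of the first character in data (1 or 2 bytes)."""
    first = data[0] * ENTRY
    leader = table[first]
    if leader == 255:                    # ordinary single-byte character
        return table[first + 2]
    second = (leader * PAGE + data[1]) * ENTRY
    return table[second + 2]

print(lookup(table, bytes([128, 65])))
```

In this model the entry for 128 065 is entry 321 of the whole table (256 + 65), which begins 963 bytes in because each entry occupies three bytes.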
Creating Double-Byte Sort Order Tables
You create a double-byte sort order table very much like you would a
single-byte table. You create sets of three-byte sequences for each
character. For each sort order page, you create 256 of these three byte
sets. At the very end, you place a single byte value of 0 that signifies
the termination of the table.
You should probably lay out the sort order tables on paper before you
attempt to write the code to generate a table.
First, you should determine those byte values that are leader bytes. For
every unique leader byte value, assign a sort order page, from page 1 to
254. Obviously, you want to keep the number of absolute sort order pages
down as much as possible to make things run faster and to use less memory.
In the triplet for each leader byte, put the assigned page number in the
first byte and set the following two bytes to 255.
Next, fill in all the other remaining values in the first 256 byte page. For
each character, assign a weighted value and place that in the third byte of
the data triplet.
Note: you can use the second byte of the data triplet as a priority
multiplier. If you need priorities higher than 255, use the priority
multiplier byte by setting it to anything between 1 (earliest in the
priority order) to 254 (last in the priority search list order).
After you've filled in the first sort order page, you can then create the
subsequent pages. In these pages, the first byte of the triplet will always
be 255, the second byte between 1 and 254 depending on your desired priority
multiplier, and the third value byte also between 1 and 254.
Finally, append a terminator byte--which needs to be a numToChar(0) value.
Once you've laid all this out on paper, you can write a BuildSortOrder_
routine that will create a global variable containing your sort order.
Tricks with Sort Order
You can do some pretty interesting things with sort orders besides handling
international issues. For example, let's assume you wanted to sort numerical
data which you stored in a character field.
Note: You generally don't need to do this because the DBF format stores
numbers as ASCII values internally. But if you use character fields to store
numbers, you get to manipulate values with more control (i.e., sort order).
So, again, let's assume you've got a character field containing numeric
data. Sometimes, in a numeric field, you might want to have spaces or
asterisks instead of zeros, like in the following example:
"0002598"
"   2598"
"***2598"
When creating a custom sort order table for numerical sorts in character
fields, you can give the space character (ASCII 32), the asterisk character
(ASCII 42), and the zero all the same priority value weighting. This would
cause the sorting/seeking routines to treat all three characters the same.
This kind of "equalizing" of sorting values also applies to those special
international characters, like letters with umlauts (e.g., the double-dots)
or accent marks over characters. You might want to treat a lower case 'a'
and a lower-case 'a' with an accent mark as the same character in sort
order.
You can also do this with upper and lower case values. If you want upper
case and lower case letters to be sorted together, give them the same
priority value.
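Here is a minimal sketch of this kind of "equalizing", written in Python as a model of the priority values only (not the full three-byte table). The specific choice of giving the space and asterisk the same priority as "0", and lower case the priority of upper case, follows the text above; sort_key is a hypothetical helper.

```python
priority = list(range(256))       # start from plain ASCII order
priority[0] = 255                 # 0 is never a legal priority value
priority[ord(" ")] = ord("0")     # space sorts like zero
priority[ord("*")] = ord("0")     # asterisk sorts like zero
for code in range(ord("a"), ord("z") + 1):
    priority[code] = code - 32    # lower case sorts like upper case

def sort_key(text):
    """Replace each character by its priority value."""
    return [priority[ord(ch)] for ch in text]

print(sort_key("0002598") == sort_key("   2598") == sort_key("***2598"))
print(sort_key("kennedy") == sort_key("KENNEDY"))
```

Both comparisons come out equal, which is exactly what lets a seek for "0002598" land on a record stored as "***2598".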
Setting the Sort Order with FileFlex
You can tell FileFlex to use a new sort order with the FileFlex command
DBSetSortOrder. Unlike most FileFlex commands, DBSetSortOrder is a wrapper
script that does not call FileFlex directly. Instead, DBSetSortOrder sets
two FileFlex global properties: gDBWorldSort and gDBSortOrder.
Note: I almost named the gDBSortOrder variable gDBWorldOrder. Then the
function would have been DBSetWorldOrder. But that seemed far too
Republican, so I restrained myself. Wouldn't it be great if you could write
a new translation table, give a quick call to DBSetWorldOrder, and--poof--a
new world order emerges? It gives new (and terrifying) meaning to the phrase
"FileFlex users rule!" [chuckle] [[shiver]].
Here's the Lingo code for DBSetSortOrder:
on DBSetSortOrder order
  global gDBWorldSort
  global gDBSortOrder
  if order = EMPTY then
    put EMPTY into gDBWorldSort
  else
    put "1" into gDBWorldSort
    put order into gDBSortOrder
  end if
  return 0
end DBSetSortOrder
When you call DBSetSortOrder, you want to pass your sort order table. Here's
an example:
put DBSetSortOrder(ASCII) into DBResult
To disable custom sort order processing, set the sort order to the empty
string:
put DBSetSortOrder("") into DBResult
Inside of FileFlex is a C++ function called worldCompare(). When a
DBCreateIndex or DBSeek command is executed, at some time, the internal
worldCompare routine is called upon to compare two strings. When
worldCompare is called, it asks the host development environment (i.e.,
Director) for the value of the reserved global variable gDBWorldSort. If
worldCompare discovers that gDBWorldSort is not empty, it then asks the host
environment for the contents of the global variable gDBSortOrder and uses
that to control the comparison of two strings.
Hint: One of the reasons building a sort order table is so complex and
precise is you're building an actual binary data structure that FileFlex can
use directly. While the table may be a bit painful to design once, this
mechanism allows FileFlex to do custom comparisons and switch sort order
tables at blinding speed.
To turn off a sort order table, send the empty string to DBSetSortOrder.
When this happens, the global gDBWorldSort is set to the empty string.
FileFlex then knows to skip the extra processing inherent in comparing
world-aware data strings.
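The switch between plain and table-driven comparison can be pictured with a small model. This is a Python sketch of the behavior described in the text, not the actual worldCompare source: it handles single-byte tables only, ignores the priority multiplier, and takes the two globals as ordinary parameters. The function name world_compare is hypothetical.

```python
def world_compare(a, b, world_sort, sort_order):
    """Return -1, 0, or 1 for two single-byte strings (bytes objects)."""
    if world_sort == "":
        # Custom ordering switched off: compare raw byte values.
        return (a > b) - (a < b)
    # Replace each byte by the priority value stored in its table entry.
    key_a = [sort_order[code * 3 + 2] for code in a]
    key_b = [sort_order[code * 3 + 2] for code in b]
    return (key_a > key_b) - (key_a < key_b)

# Reverse-order table: entry for code i holds priority 255 - i (0 avoided).
reverse = bytearray()
for code in range(256):
    priority = 255 if code == 0 else max(1, 255 - code)
    reverse += bytes([255, 255, priority])
reverse.append(0)  # terminator byte

print(world_compare(b"CLINTON", b"KENNEDY", "", reverse))   # table ignored
print(world_compare(b"CLINTON", b"KENNEDY", "1", reverse))  # table applied
```

The same pair of strings compares one way with the flag empty and the opposite way once the reverse table is active.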
Cautions: The sort order impacts the internal compare functions; it does not
reorder the dataset or the index. As a result, you should set your sort
order BEFORE you call DBCreateIndex and you should always use the
appropriate sort order table when doing a DBSeek or DBSelectIndex. Failure
to do this could cause your data to appear out of order. When writing
records, try not to get in the situation where two different sort orders
need to be active when writing one record.
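To see why the sort order used for a seek must match the one used to build the index, consider this small model. It is a Python sketch using a plain binary search over a sorted list; FileFlex' index files are organized differently, so treat it only as an illustration of how any ordered lookup goes wrong when the comparison changes underneath it. The titles are sample data and all names are hypothetical.

```python
def find(sorted_items, target, key):
    """Binary search that trusts sorted_items to be ordered by key."""
    lo, hi = 0, len(sorted_items)
    while lo < hi:
        mid = (lo + hi) // 2
        if key(sorted_items[mid]) < key(target):
            lo = mid + 1
        else:
            hi = mid
    return lo < len(sorted_items) and sorted_items[lo] == target

def ascending(s):
    return [ord(c) for c in s]

def descending(s):
    return [255 - ord(c) for c in s]

titles = ["ALIEN", "BRAZIL", "CASABLANCA", "DUNE", "EMPIRE"]
index = sorted(titles, key=descending)    # index built in reverse order

print(find(index, "BRAZIL", descending))  # same order as the index
print(find(index, "BRAZIL", ascending))   # mismatched order
```

The record is present in both cases, but the mismatched search walks the wrong way through the index and reports that it cannot find it.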
Here's a sample script from the Sort Order demo file:
on mouseUp
  global ASCIIReverse -- the reverse sort order table
  -- initialize FF session
  put DBOpenSession() into dbResult
  if dbResult < 0 then
    alert "FileFlex could not initialize!"
    exit
  end if
  -- open a database file
  put dbUse(field "theDBFile") into dbID
  if dbID < 0 then errorClose "Could not open database file."
  --
  -- create a custom index on TITLE using ASCIIReverse
  --
  buildSortOrder_ASCIIReverse -- build the sort order
  put DBSetSortOrder(ASCIIReverse) into dbResult
  put "Creating index file..." into field "status"
  updateStage
  put dbCreateIndex("REVASCII","TITLE","0","0") into ndxID
  if ndxID < 0 then errorClose "Could not create index file."
  -- fill the list
  put "Scanning data file..." into field "status"
  updateStage
  put DBSelectIndex(ndxID) into dbResult
  if dbResult < 0 then errorClose "Could not select index file."
  put "" into theList
  put DBTop() into dbResult
  repeat while 1 = 1 -- forever
    if theList <> "" then put return after theList
    put DBGetFieldByName("TITLE") into title
    updateStage
    put title after theList
    if DBSkip(1) = 3 then exit repeat
  end repeat
  put theList into field "movie list"
  updateStage
  put DBSetSortOrder(EMPTY) into dbResult -- turn off
  put DBCloseSession() into dbResult
  if dbResult < 0 then
    alert "FileFlex could not terminate!"
    exit
  end if
  put "Processing complete..." into field "status"
  updateStage
end
on errorClose s
  alert s
  put DBCloseSession() into dbResult
  if dbResult < 0 then
    alert "FileFlex could not terminate!"
    abort
  end if
  abort
end errorClose
----------------------------------------------------------------------------
Important: FileFlex uses the xBASE/dBASE III standard format. This format
does not permit 8-bit characters in memo fields contained within DBT
files. Translating to character values above 127 can cause problems with
this format. If you need to store non-ASCII text in memo fields, you should
either use a custom translation table or store your data in text files and
refer to those files from FileFlex fixed-length fields.
----------------------------------------------------------------------------
Character Translation
If you're using a language that has special characters in its character
sets (e.g., accent marks, umlauts, and other specialty characters), you may
run into an interesting problem moving documents from Macintosh to Windows
or vice-versa. That's because while ASCII is cleanly defined for the US
English character set of "a-zA-Z", that does not mean that character values
of special characters are uniformly used across platforms.
FileFlex user Antonio Lucena of Madrid, Spain describes the conversion issue
as it pertains to DOS vs. Windows files as well:
"The problem is that Windows uses different character set than MS-DOS (and
the databases created with dBASE). MS-DOS uses OEM Char set, and Windows
uses ANSI. For example in OEM, a diacritical "e" is numbered 130, but in
ANSI, same "e" is numbered 233. The same problem appears when you open a
document (with diacritical vowels on it) made with the EDIT tool from MS-DOS
and you try to open it with the WRITE tool from Windows and no previous
conversion was made."
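The two values Antonio mentions can be confirmed with standard code page tables. This Python sketch assumes that "OEM" means code page 437 and "ANSI" means Windows code page 1252 (a given DOS or Windows installation may use a different pair); both codecs ship with Python's standard library.

```python
oem_byte = bytes([130])              # value 130 in the OEM character set
letter = oem_byte.decode("cp437")    # the character that byte stands for
ansi_byte = letter.encode("cp1252")  # the same character in ANSI

print(oem_byte[0], ansi_byte[0])
```

The same accented "e" is stored as 130 in one character set and 233 in the other, which is precisely the gap a translation table has to bridge.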
Note: The above message illustrates the value of the free fileflex-talk
mailing list. Another user had discovered the translation problem and by
asking questions of this user and making that dialog public via
fileflex-talk, Antonio was able to see the message and contribute his
feedback. With feedback from him and others, we were able to identify the
need for the new DBTranslateChars function described below.
FileFlex WorldFlex technology provides for character-level translation using
much the same mechanism as used for developing sort order tables. You
develop a translation table that describes the new and old values and pass
it to FileFlex along with a container of characters to be translated.
Setting up a character translation table is very straightforward. You need
to build a Lingo string consisting of 256 characters. The position in the
string is the value of the old character and the value at that position
becomes the new character.
Note: The first character in the string is considered "position 0" by
FileFlex. Also note that you cannot place a 0 into any character position.
If you do not want translation, place the corresponding character value into
that position or the value 255.
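The position-to-value rule can be modeled in a few lines. This is a Python sketch of the table's meaning, not of DBTranslateChars itself: translate is a hypothetical stand-in, and the single 130-to-233 mapping is taken from the OEM/ANSI example quoted earlier.

```python
def translate(text, table):
    """Replace each byte by the byte stored at that position in the table."""
    if len(table) != 256:
        raise ValueError("translation table must hold exactly 256 bytes")
    return bytes(table[b] for b in text)

# Identity table: every value maps to itself, except position 0, which
# holds 255 because a 0 may not appear anywhere in the table.
identity = bytes([255] + list(range(1, 256)))

oem_to_ansi = bytearray(identity)
oem_to_ansi[130] = 233               # accented "e": OEM value -> ANSI value

print(translate(b"caf\x82", bytes(oem_to_ansi)))
```

Only the byte at position 130 changes; every other character passes through untouched.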
Creating Character Translation Utility Scripts
The best way to create the character translation table is to write a simple
utility script. Here's an example script that simply maps every character
to itself:
on buildTranslateTable_ASCIIX
  global ASCIIX
  put "" into theTable
  repeat with i = 0 to 255
    if i = 0 then
      put numToChar(255) after theTable -- use 255 in byte 0
    else
      put numToChar(i) after theTable -- position in table
    end if
  end repeat
  put theTable into ASCIIX
end buildTranslateTable_ASCIIX
Note the name of the handler is "BuildTranslateTable_ASCIIX". We've
developed a convention where the routine that builds the translation table
is called "BuildTranslateTable_" and the name of the translation itself is
appended to the end. In order to prevent confusion with sort order tables,
we've also placed an X after every translation table ("X" for an often used
abbreviation for translate, which is "Xlate"). The translation table is
placed in a global variable of the same name. So, for a translation table
that converts to Windows diacriticals, we recommend naming the handler
"BuildTranslateTable_WinCharX" and the global variable containing the
translation table "WinCharX".
Here's an example routine that converts upper case to lower case (and the
reverse):
on buildTranslateTable_CaseReverseX
  global CaseReverseX, ASCIIX
  buildTranslateTable_ASCIIX
  put ASCIIX into theTable
  -- fill in lower case
  repeat with i = 65 to 90
    put numToChar(i+32) into char i+1 of theTable
    -- using i+1 above because strings begin at 1, not 0
  end repeat
  -- fill in upper case
  repeat with i = 97 to 122
    put numToChar(i-32) into char i+1 of theTable
  end repeat
  put theTable into CaseReverseX
end buildTranslateTable_CaseReverseX
The above routine reverses the case, so an upper case "A" becomes a lower
case "a" and vice versa. To create a routine that always converts to upper
case, make both sets of characters upper case. Likewise, to create a routine
that always converts to lower case, make both sets of characters lower case.
Here's an UpperX routine:
on buildTranslateTable_UpperX
  global UpperX, ASCIIX
  buildTranslateTable_ASCIIX
  put ASCIIX into theTable
  -- fill in upper case
  repeat with i = 97 to 122
    put numToChar(i-32) into char i+1 of theTable
    -- using i+1 above because strings begin at 1, not 0
  end repeat
  put theTable into UpperX
end buildTranslateTable_UpperX
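If you'd like to check what the CaseReverseX and UpperX tables should contain before wiring them into a movie, the same tables can be modeled outside Director. This Python sketch mirrors the two handlers above; note that Python sequences start at 0, so the "i+1" adjustment needed in the Lingo scripts does not appear here. The built-in bytes.translate method plays the role of the translation step.

```python
identity = bytes([255] + list(range(1, 256)))  # position 0 holds 255, not 0

case_reverse = bytearray(identity)
upper = bytearray(identity)
for code in range(ord("A"), ord("Z") + 1):
    case_reverse[code] = code + 32   # upper case -> lower case
for code in range(ord("a"), ord("z") + 1):
    case_reverse[code] = code - 32   # lower case -> upper case
    upper[code] = code - 32          # UpperX only ever goes up

print(b"FileFlex".translate(bytes(case_reverse)))
print(b"FileFlex".translate(bytes(upper)))
```

The first table swaps the case of every letter, while the second leaves capitals alone and raises everything else.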
----------------------------------------------------------------------------
WARNING: Make absolutely certain you fill in all 256 bytes. Failure to do
this could cause FileFlex to scan beyond the end of the translation table
and the results could be unpredictable and your program could abnormally
terminate.
----------------------------------------------------------------------------
Translating Characters Using FileFlex
You can use FileFlex to translate character sets within a text container
using the DBTranslateChars function. DBTranslateChars takes two parameters:
the string to be translated and the pre-built translation table described
above. It returns the translated string:
put DBTranslateChars(myString,CaseReverseX) into newString
Here's a sample routine that will do the character translation (it
presupposes that FileFlex has been initialized properly with DBOpenSession):
on mouseUp
  global CaseReverseX
  buildTranslateTable_CaseReverseX
  put DBTranslateChars(field "text data",CaseReverseX) into field "text data"
end mouseUp
Case Translation
If you're using a language that has special characters in its character
sets (e.g., accent marks, umlauts, and other specialty characters), you may
run into an interesting problem converting between upper and lower case.
With standard ASCII, it's easy to do a case conversion: just add or subtract
32 to the character's value. That's because in ASCII, the upper or lower
case character is always algorithmically deterministic. However, when
dealing with international character sets where lower case characters might
have diacritical marks, it becomes much harder. That's because the
characters have a wide variety of values and because there is little
standardization.
FileFlex's WorldFlex technology provides for intelligent case translation
using much the same mechanism as character translation tables. You develop
a translation table that maps each old value to its new value and pass it to
FileFlex along with a container of characters to be translated.
You'll need to set up two case translation tables: one going to upper case
and one going to lower case. For each table, you must build a Lingo string
consisting of 256 characters. The position in the string is the value of the
old character and the value at that position becomes the new character.
Note: The first character in the string is considered "position 0" by
FileFlex. Also note that you cannot place a 0 into any character position.
If you do not want a character translated, place either that character's own
value or the value 255 into its position.
Creating Case Translation Utility Scripts
The best way to create the case translation table is to write a simple
utility script. Here's an example script that simply converts ASCII lower
case to ASCII upper case:
on buildCaseTable_AsciiUC
global AsciiUC
put "" into theTable
-- Although it takes a few extra cycles, consider
-- building a full table first, then modifying it below.
-- This is much easier to understand and test.
repeat with i = 0 to 255
if i = 0 then
put numToChar(255) after theTable -- use 255 in byte 0
else
put numToChar(i) after theTable -- position in table
end if
end repeat
-- fill in upper case
repeat with i = 97 to 122
put numToChar(i-32) into char i+1 of theTable
-- using i+1 above because strings begin at 1, not 0
end repeat
put theTable into AsciiUC
end buildCaseTable_AsciiUC
Note the name of the handler is "buildCaseTable_AsciiUC". We've developed a
convention where the routine that builds the table is named
"buildCaseTable_" followed by the name of the translation itself. To
prevent confusion with other tables, we also end each case table's name with
"UC" (for translation to upper case) or "LC" (for translation to lower
case). The finished table is placed in a global variable named after the
translation (here, AsciiUC).
Here's the routine that translates back down to lower case:
on buildCaseTable_AsciiLC
global AsciiLC
put "" into theTable
-- Although it takes a few extra cycles, consider
-- building a full table first, then modifying it below.
-- This is much easier to understand and test.
repeat with i = 0 to 255
if i = 0 then
put numToChar(255) after theTable -- use 255 in byte 0
else
put numToChar(i) after theTable -- position in table
end if
end repeat
-- fill in lower case
repeat with i = 65 to 90
put numToChar(i+32) into char i+1 of theTable
-- using i+1 above because strings begin at 1, not 0
end repeat
put theTable into AsciiLC
end buildCaseTable_AsciiLC
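Tables for accented characters can be built the same way. The following is
only a sketch: the handler name buildCaseTable_GermanUC and the global
GermanUC are hypothetical, and the character values assume the Mac OS Roman
character set (other platforms and character sets use different values, so
check them against your own). It starts from the AsciiUC table built above
and adds three umlaut pairs whose upper and lower case values are not a
fixed distance apart:

```lingo
-- Hypothetical example, not part of FileFlex.
-- Character values below assume the Mac OS Roman character set.
on buildCaseTable_GermanUC
  global GermanUC, AsciiUC
  buildCaseTable_AsciiUC
  put AsciiUC into theTable
  -- each pair has a different offset, so the a-z arithmetic does not apply
  put numToChar(128) into char 138+1 of theTable -- a-umlaut to A-umlaut
  put numToChar(133) into char 154+1 of theTable -- o-umlaut to O-umlaut
  put numToChar(134) into char 159+1 of theTable -- u-umlaut to U-umlaut
  -- using value+1 above because strings begin at 1, not 0
  put theTable into GermanUC
end buildCaseTable_GermanUC
```

A matching lower case table would reverse each pair, for example placing
numToChar(138) into char 128+1.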
----------------------------------------------------------------------------
WARNING: Make absolutely certain you fill in all 256 bytes. If you do not,
FileFlex could scan beyond the end of the translation table, with
unpredictable results, and your program could terminate abnormally.
----------------------------------------------------------------------------
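One way to guard against an incomplete table is to check it before handing
it to FileFlex. The handler below is a hypothetical helper, not part of
FileFlex. It assumes only the two rules stated in this section: a table
holds exactly 256 characters, and no position holds the value 0.

```lingo
-- Hypothetical helper, not part of FileFlex.
-- Returns TRUE if theTable has 256 characters and none has the value 0.
on checkCaseTable theTable
  -- the table must hold exactly 256 characters
  if length(theTable) <> 256 then
    return FALSE
  end if
  -- no position may hold the value 0
  repeat with i = 1 to 256
    if charToNum(char i of theTable) = 0 then
      return FALSE
    end if
  end repeat
  return TRUE
end checkCaseTable
```

A script could then refuse to continue when a table fails the check:

```lingo
if not checkCaseTable(AsciiUC) then
  alert "AsciiUC is not a complete case table"
end if
```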
Intelligent Case Conversion Using FileFlex
Case translation is used in a number of important ways within FileFlex, in
particular within the intrinsic functions used in indexes and queries, and
through special utility functions provided to perform simple case
conversion.
You can tell FileFlex to use a case translation table with the FileFlex
command DBSetCaseTables. Unlike most FileFlex commands, DBSetCaseTables is a
wrapper script that does not call FileFlex directly. Instead,
DBSetCaseTables sets three FileFlex global properties: gDBWorldCase,
gDBWorldUpper and gDBWorldLower.
Here's the Lingo code for DBSetCaseTables:
on DBSetCaseTables upperTable, lowerTable
global gDBWorldCase
global gDBWorldUpper, gDBWorldLower
if (upperTable = EMPTY or lowerTable = EMPTY) then
put EMPTY into gDBWorldCase
else
put "1" into gDBWorldCase
put upperTable into gDBWorldUpper
put lowerTable into gDBWorldLower
end if
return 0
end DBSetCaseTables
When you call DBSetCaseTables, pass the upper case table first and the lower
case table second. Here's an example:
put DBSetCaseTables(AsciiUC, AsciiLC) into DBResult
To disable custom case conversion processing, pass the empty string to
DBSetCaseTables:
put DBSetCaseTables("") into DBResult
Inside FileFlex is a C++ function called worldUpper(). When an intrinsic
UPPER function is executed, worldUpper is called to do the case conversion.
It first asks the host development environment (i.e., Director) for the
value of the reserved global variable gDBWorldCase. If gDBWorldCase is not
empty, worldUpper then asks the host environment for the contents of the
global variables gDBWorldUpper and gDBWorldLower and uses them to control
the conversion of the strings.
To turn off custom case conversion, send the empty string to
DBSetCaseTables. When this happens, the global gDBWorldCase is set to the
empty string. FileFlex then knows to skip the extra processing inherent in
case conversion of world-aware data strings.
Cautions: Be careful that the first parameter is the upper case table and
the second parameter is the lower case table. Also make sure you pass two
tables. Failure to pass two complete case conversion tables could cause
unpredictable results and might lead to abnormal termination.
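Putting the pieces together, a movie might install the ASCII tables and
later remove them. This is a sketch only: the handler names
installAsciiCaseTables and removeCaseTables are hypothetical, while
buildCaseTable_AsciiUC, buildCaseTable_AsciiLC and DBSetCaseTables are the
scripts listed earlier in this section.

```lingo
-- Hypothetical handler: build both ASCII case tables and install them.
on installAsciiCaseTables
  global AsciiUC, AsciiLC
  buildCaseTable_AsciiUC
  buildCaseTable_AsciiLC
  -- upper case table first, lower case table second
  return DBSetCaseTables(AsciiUC, AsciiLC)
end installAsciiCaseTables

-- Hypothetical handler: return to standard case conversion.
on removeCaseTables
  return DBSetCaseTables("")
end removeCaseTables
```

Building both tables inside the same handler that installs them makes it
harder to pass only one table, or to pass them in the wrong order.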
Standalone Intelligent Case Conversion Functions
In addition to doing intelligent case conversions within index and query
functions, FileFlex provides you with the ability to do intelligent case
conversions of standalone strings.
The function DBUpper will convert a string intelligently from lower case to
upper case. If case tables have already been set with DBSetCaseTables,
DBUpper will use those tables; otherwise it will use the standard ASCII
upper case conversion. Here's how to call DBUpper:
put DBUpper(myString) into newString
Likewise, DBLower will convert a string intelligently from upper case to
lower case. If case tables have already been set with DBSetCaseTables,
DBLower will use those tables; otherwise it will use the standard ASCII
lower case conversion. Here's how to call DBLower:
put DBLower(myString) into newString
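As an illustration, a button script modeled on the DBTranslateChars sample
could convert a field in place. The field name "text data" is borrowed from
that sample, and the script assumes, as that sample does, that FileFlex has
already been initialized with DBOpenSession:

```lingo
-- Sketch only: upper-case the contents of a field in place.
-- Uses the tables set with DBSetCaseTables if any, else standard ASCII.
on mouseUp
  put DBUpper(field "text data") into field "text data"
end mouseUp
```

Substituting DBLower for DBUpper converts the field to lower case instead.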